Data-Free Quantization

Our lab explores data-free model compression with mutual knowledge transfer.

Mutual Knowledge Transfer for Data-Free Quantization

This research proposes a novel Mutual Knowledge Transfer (MKT) framework for Data-Free Quantization (DFQ), in which the full-precision (FP) and quantized (Q) models exchange knowledge without access to the original training data. MKT improves on previous DFQ techniques by strengthening both layer-wise and class-wise bidirectional knowledge transfer.

MKT Framework

Fig 1. Overall structure of Mutual Knowledge Transfer (source: Paper Figure 1)

Key Components of MKT:

  • Local/Global Correlation-based Alignment (LCA/GCA): Adjusts feature similarity between the FP and Q models using Multi-Head Attention to capture both intra-layer (LCA) and inter-layer (GCA) dependencies (see the first sketch after this list).
  • Class-Aware BNS Alignment: Learns class-specific embedding weights to reflect class-channel dependencies in batch normalization statistics (see the second sketch after this list).
  • Synthetic Sample Generation: Optimizes noise-based data via a class-guided batch normalization loss and a hard-sample-aware cross-entropy to simulate the original training data (sketched after the Training Procedure steps).
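Below is a minimal sketch of the correlation-based alignment idea in PyTorch. The module name `CorrelationAlignment`, the use of `nn.MultiheadAttention` over pooled per-layer features, and the MSE-based local/global terms are illustrative assumptions rather than the paper's exact formulation; it also assumes every layer's features share (or have been projected to) a common dimension `dim`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CorrelationAlignment(nn.Module):
    """Illustrative LCA/GCA-style alignment (assumed form): attend over pooled
    per-layer features and penalize the gap between FP and Q models."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, fp_feats, q_feats):
        # fp_feats, q_feats: lists of per-layer feature maps, each of shape (B, dim, H, W).
        # Global-average-pool each layer and stack layers as a sequence: (B, L, dim).
        fp_seq = torch.stack([f.mean(dim=(2, 3)) for f in fp_feats], dim=1)
        q_seq = torch.stack([f.mean(dim=(2, 3)) for f in q_feats], dim=1)
        # Self-attention over the layer axis captures inter-layer (global) dependencies.
        fp_ctx, _ = self.attn(fp_seq, fp_seq, fp_seq)
        q_ctx, _ = self.attn(q_seq, q_seq, q_seq)
        # Local term: match per-layer features; global term: match attention contexts.
        local_loss = F.mse_loss(q_seq, fp_seq.detach())
        global_loss = F.mse_loss(q_ctx, fp_ctx.detach())
        return local_loss + global_loss
```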
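The class-aware BNS alignment can be sketched as a learned per-class, per-channel weighting of the usual BN-statistics matching loss. The class name `ClassAwareBNSLoss`, the softmax-normalized embedding, and the squared-gap form are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClassAwareBNSLoss(nn.Module):
    """Illustrative class-aware BNS alignment (assumed form): a learnable
    per-class embedding re-weights the channel-wise gap between the current
    batch statistics and the FP model's stored BN running statistics."""
    def __init__(self, num_classes, num_channels):
        super().__init__()
        # One learnable weight per (class, channel) pair.
        self.class_weight = nn.Embedding(num_classes, num_channels)

    def forward(self, feat, labels, running_mean, running_var):
        # feat: (B, C, H, W) activations entering a BN layer of the FP model.
        # labels: (B,) class labels assigned to the synthetic samples.
        batch_mean = feat.mean(dim=(0, 2, 3))                 # (C,)
        batch_var = feat.var(dim=(0, 2, 3), unbiased=False)   # (C,)
        # Softmax-normalized class-channel weights, averaged over the batch.
        w = F.softmax(self.class_weight(labels), dim=1).mean(dim=0)  # (C,)
        mean_gap = (batch_mean - running_mean) ** 2
        var_gap = (batch_var - running_var) ** 2
        return (w * (mean_gap + var_gap)).sum()
```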

Training Procedure:

  1. Step 1: Initialize synthetic noise images and class labels, then optimize them with the class-aware BNS loss (see the first sketch below).
  2. Step 2: Derive the Q model from the FP model and train it with quantization-aware training (QAT) under the MKT loss, which combines LCA, GCA, and a KL-divergence term (see the second sketch below).
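A minimal sketch of Step 1, assuming PyTorch and the hypothetical `ClassAwareBNSLoss` above. The way the FP model exposes its BN-layer inputs, the optimizer settings, the image resolution, and the hard-sample weighting are all assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def generate_synthetic_batch(fp_model, bns_loss_fns, num_classes,
                             batch_size=64, steps=500, lr=0.1):
    """Step 1 sketch: optimize noise images so the FP model's BN-layer
    statistics and class predictions resemble real training data."""
    device = next(fp_model.parameters()).device
    images = torch.randn(batch_size, 3, 32, 32, device=device, requires_grad=True)
    labels = torch.randint(0, num_classes, (batch_size,), device=device)
    opt = torch.optim.Adam([images], lr=lr)
    fp_model.eval()
    for _ in range(steps):
        opt.zero_grad()
        # Assumed interface: the model returns logits plus a list of
        # (bn_input, bn_module) pairs collected via forward hooks.
        logits, bn_inputs = fp_model(images)
        # Class-guided BNS loss accumulated over every BN layer.
        bns_loss = sum(fn(x, labels, bn.running_mean, bn.running_var)
                       for fn, (x, bn) in zip(bns_loss_fns, bn_inputs))
        # Hard-sample-aware CE: up-weight samples the FP model still finds hard
        # (this particular softmax weighting is an assumption).
        ce = F.cross_entropy(logits, labels, reduction='none')
        loss = bns_loss + (ce * ce.detach().softmax(dim=0)).sum()
        loss.backward()
        opt.step()
    return images.detach(), labels
```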
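Step 2 can be sketched as a standard QAT step whose loss combines the correlation alignment above with a softened KL-divergence distillation term. The function name, equal loss weighting, and temperature value are assumptions.

```python
import torch
import torch.nn.functional as F

def mkt_qat_step(fp_model, q_model, align_module, images, optimizer, temperature=4.0):
    """Step 2 sketch: one QAT step with an MKT-style loss
    (correlation alignment + KL distillation)."""
    q_model.train()
    fp_model.eval()
    optimizer.zero_grad()
    with torch.no_grad():
        # Assumed interface: models return (logits, list of per-layer features).
        fp_logits, fp_feats = fp_model(images)
    q_logits, q_feats = q_model(images)
    # Local/global correlation alignment between FP and Q features (see sketch above).
    align_loss = align_module(fp_feats, q_feats)
    # KL divergence between softened Q and FP predictions (standard distillation term).
    kl = F.kl_div(F.log_softmax(q_logits / temperature, dim=1),
                  F.softmax(fp_logits / temperature, dim=1),
                  reduction='batchmean') * temperature ** 2
    loss = align_loss + kl
    loss.backward()
    optimizer.step()
    return loss.item()
```
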
Quantization Results

Fig 2. Comparison of MKT with other state-of-the-art DFQ methods (source: Paper Table 1)

MKT outperforms prior DFQ methods such as GDFQ, ZeroQ, HAST, and TexQ on CIFAR-10/100 and ImageNet benchmarks, reaching 57.24% Top-1 accuracy on CIFAR-100 with ResNet-20 at 3w3a (3-bit weight, 3-bit activation) quantization and demonstrating robustness in ultra-low-bit settings.
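For context, a minimal sketch of uniform symmetric fake quantization at 3 bits, the kind of operation a 3w3a setting applies to weights and activations; per-tensor scaling is an assumption here and not necessarily the quantizer used in the paper.

```python
import torch

def fake_quantize(x, num_bits=3):
    """Uniform symmetric fake quantization: round to num_bits signed levels,
    then dequantize so training continues in floating point."""
    qmax = 2 ** (num_bits - 1) - 1                     # e.g. 3 for signed 3-bit values
    scale = x.abs().max().clamp(min=1e-8) / qmax       # per-tensor scale (assumed)
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
```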